GERSCHGORIN'S THEOREM FOR GENERALIZED 
EIGENVALUE PROBLEMS IN THE EUCLIDEAN METRIC 
Abstract. We present Gerschgorin-type eigenvalue inclusion sets applicable 
to generalized eigenvalue problems. Our sets are defined by circles in the 
complex plane in the standard Euclidean metric, and are easier to compute 
than known similar results. As one application we use our results to provide 
a forward error analysis for a computed eigenvalue of a diagonalizable pencil. 



1. Introduction 

For a standard eigenvalue problem $Ax = \lambda x$ where $A \in \mathbb{C}^{n\times n}$, Gerschgorin's 
theorem [8] defines in the complex plane a union of n disks that contains all the 
n eigenvalues. Its simple exposition and applicability make it an extremely useful 
tool in estimating eigenvalue bounds. It also plays an important role in eigenvalue 
perturbation theory [12, 10]. 

The generalized eigenvalue problem $Ax = \lambda Bx$ where $A, B \in \mathbb{C}^{n\times n}$ also arises in 
many scientific applications. It should be useful to have available a similar simple 
theory to estimate the eigenvalues for this type of problem as well. 

In fact, Stewart and Sun [HI [10] provide an eigenvalue inclusion set applicable to 
generalized eigenvalue problems. The set is the union of n regions defined by 

(1.1)  $G_i(A,B) \equiv \{ z \in \mathbb{C} : \chi(z, a_{i,i}/b_{i,i}) \le \varrho_i \},$

where 

(1.2)  $\varrho_i = \sqrt{ \dfrac{ \bigl(\sum_{j\neq i} |a_{i,j}|\bigr)^2 + \bigl(\sum_{j\neq i} |b_{i,j}|\bigr)^2 }{ |a_{i,i}|^2 + |b_{i,i}|^2 } }.$

All the eigenvalues of the pencil $A - \lambda B$ lie in the union of $G_i(A,B)$, i.e., if $\lambda$ is an 
eigenvalue, then 

$\lambda \in G(A,B) \equiv \bigcup_{i=1}^{n} G_i(A,B).$

Note that $\lambda$ can be infinite. We briefly review the definition of eigenvalues of a 
pencil at the beginning of section 2. 

The region (1.1) is defined in terms of the chordal metric $\chi$, defined by [H Ch.7.7] 

$\chi(x,y) = \dfrac{|x-y|}{\sqrt{1+|x|^2}\,\sqrt{1+|y|^2}}.$



Date: August 9, 2010. 

2000 Mathematics Subject Classification. Primary 15A22, 15A42, 65F15. 

Key words and phrases. Gerschgorin's theorem, generalized eigenvalue problems, Euclidean 
metric, forward error analysis. 


YUJI NAKATSUKASA 



The justification of using the chordal metric instead of the more standard Euclidean 
metric is in the unifying treatment of finite and infinite eigenvalues [10]. The use 
of the chordal metric has thus become a common practice in perturbation analyses 
for generalized eigenvalue problems, and some recent results [2, 5] are presented in 
terms of this metric. 

However, using the chordal metric makes the application of the theory less 
intuitive and usually more complicated. In particular, interpreting the set G in the 
Euclidean metric is a difficult task, as opposed to the Gerschgorin set for 
standard eigenvalue problems, which is defined as a union of n disks. Another caveat of 
using G is that it is not clear whether the region G will give a nontrivial estimate 
of the eigenvalues. Specifically, since any two points in the complex plane have 
distance smaller than 1 in the chordal metric, if there exists i such that $\varrho_i \ge 1$, 
then G is the whole complex plane, providing no information. In view of (1.2), it 
follows that G is useful only when both A and B have small off-diagonal elements. 
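
To make the last remark concrete, here is a minimal Python sketch of the chordal distance and of the radius in (1.2). The function names `chordal` and `chordal_radius` and the 2x2 test pencil are illustrative choices, not part of the original analysis.

```python
import math

def chordal(x: complex, y: complex) -> float:
    """Chordal distance between two finite points of the complex plane."""
    return abs(x - y) / (math.sqrt(1 + abs(x) ** 2) * math.sqrt(1 + abs(y) ** 2))

def chordal_radius(A, B, i):
    """Radius of the i-th region G_i(A, B), following (1.2)."""
    n = len(A)
    off_a = sum(abs(A[i][j]) for j in range(n) if j != i)
    off_b = sum(abs(B[i][j]) for j in range(n) if j != i)
    return math.sqrt((off_a ** 2 + off_b ** 2)
                     / (abs(A[i][i]) ** 2 + abs(B[i][i]) ** 2))

# Hypothetical 2x2 pencil with sizeable off-diagonal entries.
A = [[2, 3], [3, 2]]
B = [[2, 1], [1, 2]]
radii = [chordal_radius(A, B, i) for i in range(2)]

# Even two points that are very far apart in the Euclidean sense
# have chordal distance below 1.
far_apart = chordal(1e6, -1e6)
```

For this pencil both radii exceed 1, so every point of the plane passes the test in (1.1) and the set G carries no information.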

Another Gerschgorin-type eigenvalue localization theory applicable to 
generalized eigenvalue problems appears in a recent paper [4] by Kostic et al. Their inclusion 
set is defined by 

(1.3)  $K_i(A,B) \equiv \Bigl\{ z \in \mathbb{C} : |b_{i,i}z - a_{i,i}| \le \sum_{j\neq i} |b_{i,j}z - a_{i,j}| \Bigr\},$

and all the eigenvalues of the pencil $A - \lambda B$ lie in the union $K(A,B) \equiv \bigcup_{i=1}^{n} K_i(A,B)$. 
This set is defined in the Euclidean metric, and (1.3) shows that K(A, B) is a 
compact set in the complex plane $\mathbb{C}$ if and only if B is strictly diagonally dominant. 
However, the set (1.3) is in general a complicated region, which makes its practical 
application difficult. 

The goal of this paper is to present a different generalization of Gerschgorin's 
theorem applicable to generalized eigenvalue problems, which solves the issues 
mentioned above. In brief, our eigenvalue inclusion sets have the following properties: 

• They involve only circles in the Euclidean complex plane, using the same 
information as (1.1) does. They are therefore simple to compute and visualize. 

• They are defined in the Euclidean metric, but still deal with finite and 
infinite eigenvalues uniformly. 

• One variant $\Gamma^S(A,B)$ is a union of n disks when B is strictly diagonally 
dominant. 

• Comparison with G(A, B): Our results are defined in the Euclidean metric. 
Tightness is incomparable, but our results are tighter when B is close to a 
diagonal matrix. 

• Comparison with K(A, B): Our results are defined by circles and are much 
simpler. K(A, B) is always tighter, but our results approach K(A, B) when 
B is close to a diagonal matrix. 

In summary, our results provide a method for estimating eigenvalues of (A, B) in a 
much cheaper way than the two known results do. 

The structure of the paper is as follows. In section 2 we describe our idea 
and derive our main Gerschgorin theorems for generalized eigenvalue problems. 
Simple examples and plots are shown in section 3 to illustrate the properties of 
different regions. Section 4 presents one application of our results, where we develop 
forward error analyses for the computed eigenvalues of a non-Hermitian generalized 
eigenvalue problem. 

2. Main Gerschgorin theorems 

In this section we develop our Gerschgorin theorem and its variants. First the 
basic idea for bounding the eigenvalue location is discussed. Section 2.2 presents 
a simple bound and derives our first Gerschgorin theorem. In section 2.3 we carry 
out a more careful analysis and obtain a tighter result. In section 2.4 we show that 
our results can localize a specific number of eigenvalues, a well-known property of 
G and the Gerschgorin set for standard eigenvalue problems. 

As a brief summary of the eigenvalues of a pencil $A - \lambda B$ where $A, B \in \mathbb{C}^{n\times n}$, $\lambda$ 
is a finite eigenvalue of the pencil if $\det(A - \lambda B) = 0$, and in this case there exists 
nonzero $x \in \mathbb{C}^n$ such that $Ax = \lambda Bx$. If the degree of the characteristic polynomial 
$\det(A - \lambda B)$ is $d < n$, then we say the pencil has $n - d$ infinite eigenvalues. In 
this case, there exists a nonzero vector $x \in \mathbb{C}^n$ such that $Bx = 0$. When B is 
nonsingular, the pencil has n finite eigenvalues, matching those of $B^{-1}A$. 

Throughout the paper we assume that for each $i \in \{1, 2, \dots, n\}$, the ith row of 
either A or B is strictly diagonally dominant, unless otherwise mentioned. Although 
this may seem a rather restrictive assumption, its justification is the observation 
that the set G(A, B) is always the entire complex plane unless this assumption is 
true. 

2.1. Idea. Suppose $Ax = \lambda Bx$ (we consider the case $\lambda = \infty$ later). We write 
$x = (x_1, x_2, \dots, x_n)^T$ and denote by $a_{p,q}$ and $b_{p,q}$ the (p, q)th element of A and B 
respectively. Denote by i the integer such that $|x_i| = \max_{1\le j\le n} |x_j|$, so that $x_i \neq 0$. 
First we consider the case where the ith row of B is strictly diagonally dominant, 
so $|b_{i,i}| > \sum_{j\neq i} |b_{i,j}|$. From the ith equation of $Ax = \lambda Bx$ we have 

(2.1)  $a_{i,i}x_i + \sum_{j\neq i} a_{i,j}x_j = \lambda\Bigl( b_{i,i}x_i + \sum_{j\neq i} b_{i,j}x_j \Bigr).$

Dividing both sides by $x_i$ and rearranging yields 




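(2.2)  $\Bigl| \lambda\Bigl( b_{i,i} + \sum_{j\neq i} b_{i,j}\dfrac{x_j}{x_i} \Bigr) - a_{i,i} \Bigr| \le \sum_{j\neq i} |a_{i,j}| \dfrac{|x_j|}{|x_i|} \le R_i,$

where we write $R_i = \sum_{j\neq i} |a_{i,j}|$. The last inequality holds because $|x_j| \le |x_i|$ for 
all j. Here, using the assumption $|b_{i,i}| > \sum_{j\neq i} |b_{i,j}|$, we have 

$\Bigl| b_{i,i} + \sum_{j\neq i} b_{i,j}\dfrac{x_j}{x_i} \Bigr| \ge |b_{i,i}| - \sum_{j\neq i} |b_{i,j}| \dfrac{|x_j|}{|x_i|} \ge |b_{i,i}| - \sum_{j\neq i} |b_{i,j}| > 0,$

where we used $|x_j| \le |x_i|$ again. Hence we can divide (2.2) by $\bigl| b_{i,i} + \sum_{j\neq i} b_{i,j}x_j/x_i \bigr|$, 
which yields 

(2.3)  $\Bigl| \lambda - \dfrac{a_{i,i}}{b_{i,i} + \sum_{j\neq i} b_{i,j}x_j/x_i} \Bigr| \le \dfrac{R_i}{\bigl| b_{i,i} + \sum_{j\neq i} b_{i,j}x_j/x_i \bigr|}.$

Now, writing $\gamma_i = \bigl(\sum_{j\neq i} b_{i,j}x_j/x_i\bigr)/b_{i,i}$, we have $|\gamma_i| \le \sum_{j\neq i} |b_{i,j}|/|b_{i,i}|$ $(\equiv r_i) < 1$, 
and (2.3) becomes 

(2.4)  $\Bigl| \lambda - \dfrac{a_{i,i}}{b_{i,i}} \cdot \dfrac{1}{1+\gamma_i} \Bigr| \le \dfrac{R_i}{|b_{i,i}|} \cdot \dfrac{1}{|1+\gamma_i|}.$

Our interpretation of this inequality is as follows: $\lambda$ lies in the disk of radius 
$R_i/(|b_{i,i}||1+\gamma_i|)$ centered at $a_{i,i}/(b_{i,i}(1+\gamma_i))$, defined in the complex plane. 
Unfortunately the exact value of $\gamma_i$ is unknown, so we cannot specify the disk. Fortunately, 
we show in section 2.2 that using $|\gamma_i| \le r_i$ we can obtain a region that contains all 
the disks defined by (2.4) for any $\gamma_i$ such that $|\gamma_i| \le r_i$. 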

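Before we go on to analyze the inequality (2.4), let us consider the case where 
the ith row of A is strictly diagonally dominant. As we will see, this also lets us 
treat infinite eigenvalues. 

Recall (2.1). We first note that if $|x_i| = \max_j |x_j|$ and the ith row of A is 
strictly diagonally dominant, then $\lambda \neq 0$, because 

$\Bigl| a_{i,i}x_i + \sum_{j\neq i} a_{i,j}x_j \Bigr| \ge |a_{i,i}||x_i| - \sum_{j\neq i} |a_{i,j}||x_j| \ge |x_i|\Bigl( |a_{i,i}| - \sum_{j\neq i} |a_{i,j}| \Bigr) > 0.$

Therefore, in place of (2.1) we start with the equation 

$b_{i,i}x_i + \sum_{j\neq i} b_{i,j}x_j = \dfrac{1}{\lambda}\Bigl( a_{i,i}x_i + \sum_{j\neq i} a_{i,j}x_j \Bigr).$

Note that this expression includes the case $\lambda = \infty$, because then the equation 
becomes $Bx = 0$. Following the same analysis as above, we arrive at the inequality 
corresponding to (2.4): 

(2.5)  $\Bigl| \dfrac{1}{\lambda} - \dfrac{b_{i,i}}{a_{i,i}} \cdot \dfrac{1}{1+\gamma_i^A} \Bigr| \le \dfrac{R_i^A}{|a_{i,i}|} \cdot \dfrac{1}{|1+\gamma_i^A|},$

where we write $R_i^A = \sum_{j\neq i} |b_{i,j}|$ and $\gamma_i^A = \bigl(\sum_{j\neq i} a_{i,j}x_j/x_i\bigr)/a_{i,i}$. Note that 
$|\gamma_i^A| \le \sum_{j\neq i} |a_{i,j}|/|a_{i,i}|$ $(\equiv r_i^A) < 1$. Therefore we are in essentially the same 
situation as in (2.4), the only difference being that we are bounding $1/\lambda$ instead of 
$\lambda$. 

In summary, in both cases the problem boils down to finding a region that 
contains all z such that 

(2.6)  $\Bigl| z - \dfrac{s}{1+\gamma} \Bigr| \le \dfrac{t}{|1+\gamma|},$

where $s \in \mathbb{C}$, $t > 0$ are known and $0 < r < 1$ is known such that $|\gamma| \le r$. 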

2.2. Gerschgorin theorem. First we bound the right-hand side of (2.6). This 
can be done simply by 

(2.7)  $\dfrac{t}{|1+\gamma|} \le \dfrac{t}{1-|\gamma|} \le \dfrac{t}{1-r}.$

Next we consider a region that contains all the possible centers of the disk (2.4). 
We use the following result. 

Lemma 2.1. If $|\gamma| \le r < 1$, then the point $1/(1+\gamma)$ lies in the disk in the complex 
plane of radius $r/(1-r)$ centered at 1. 

Proof. 

$\Bigl| \dfrac{1}{1+\gamma} - 1 \Bigr| = \Bigl| \dfrac{\gamma}{1+\gamma} \Bigr| \le \dfrac{r}{1-|\gamma|} \le \dfrac{r}{1-r}.$  □ 

In view of (2.6), this means that $s/(1+\gamma)$, the center of the disk (2.6), has to 
lie in the disk of radius $|s|r/(1-r)$ centered at s. Combining this and (2.7), we 
conclude that z that satisfies (2.6) is included in the disk of radius 
$|s|\dfrac{r}{1-r} + \dfrac{t}{1-r}$ 
centered at s. 
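
As a sanity check of this enclosure, the following Python sketch samples admissible values of $\gamma$ and confirms that every disk $|z - s/(1+\gamma)| \le t/|1+\gamma|$ stays within distance $(|s|r + t)/(1-r)$ of s. The values of s, t, r and the helper name `enclosing_radius` are arbitrary illustrative choices.

```python
import cmath
import math
import random

def enclosing_radius(s: complex, t: float, r: float) -> float:
    """Radius of the disk centered at s that covers every disk of the form (2.6)."""
    return (abs(s) * r + t) / (1 - r)

random.seed(0)
s, t, r = 1.5 - 0.5j, 0.7, 0.4   # illustrative values only
bound = enclosing_radius(s, t, r)

worst = 0.0
for _ in range(20000):
    # sample gamma with |gamma| <= r (random modulus and angle)
    gamma = random.uniform(0, r) * cmath.exp(1j * random.uniform(0, 2 * math.pi))
    center = s / (1 + gamma)
    radius = t / abs(1 + gamma)
    # distance from s to the farthest point of this particular disk
    worst = max(worst, abs(center - s) + radius)
```

The sampled maximum `worst` never exceeds `bound`, in agreement with Lemma 2.1 and (2.7).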

Using this for (2.4) by letting $s = a_{i,i}/b_{i,i}$, $t = R_i/|b_{i,i}|$ and $r = r_i$, we see that 
$\lambda$ that satisfies (2.4) is necessarily included in the disk centered at $a_{i,i}/b_{i,i}$, and of 
radius 

$\rho_i = \dfrac{|a_{i,i}|}{|b_{i,i}|} \cdot \dfrac{r_i}{1-r_i} + \dfrac{R_i}{|b_{i,i}|} \cdot \dfrac{1}{1-r_i} = \dfrac{|a_{i,i}| r_i + R_i}{|b_{i,i}|(1-r_i)}.$
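Similarly, applying the result to (2.5), we see that $1/\lambda$ satisfying (2.5) has to satisfy 

(2.8)  $\Bigl| \dfrac{1}{\lambda} - \dfrac{b_{i,i}}{a_{i,i}} \Bigr| \le \dfrac{|b_{i,i}| r_i^A + R_i^A}{|a_{i,i}|(1-r_i^A)}.$

This is equivalent to 

$|a_{i,i} - \lambda b_{i,i}| \le \dfrac{|b_{i,i}| r_i^A + R_i^A}{1-r_i^A}\,|\lambda|.$

If $b_{i,i} = 0$, this becomes $\dfrac{R_i^A}{1-r_i^A}|\lambda| \ge |a_{i,i}|$, which is $|\lambda| \ge \dfrac{|a_{i,i}|}{R_i^A}(1-r_i^A)$ when 
$R_i^A \neq 0$. If $b_{i,i} = R_i^A = 0$, no finite $\lambda$ satisfies the inequality, so we say the point 
$\lambda = \infty$ satisfies the inequality. 
If $b_{i,i} \neq 0$, we have 

(2.9)  $\Bigl| \lambda - \dfrac{a_{i,i}}{b_{i,i}} \Bigr| \le \dfrac{|b_{i,i}| r_i^A + R_i^A}{|b_{i,i}|(1-r_i^A)}\,|\lambda|.$

For simplicity, we write this inequality as 

(2.10)  $|\lambda - \alpha_i| \le \beta_i |\lambda|,$

where $\alpha_i = \dfrac{a_{i,i}}{b_{i,i}}$ and $\beta_i = \dfrac{|b_{i,i}| r_i^A + R_i^A}{|b_{i,i}|(1-r_i^A)} > 0$. Notice that the equality of (2.10) 
holds on a certain circle of Apollonius [6, sec. 2], defined by $|\lambda - \alpha_i| = \beta_i|\lambda|$. It is 
easy to see that the radius of the Apollonius circle is 

$\rho_i^A = \dfrac{|\alpha_i|\beta_i}{|1-\beta_i^2|},$

and the center is 

$c_i = \dfrac{\alpha_i}{1-\beta_i^2}.$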
From (2.10) we observe the following. The Apollonius circle divides the complex 
plane into two regions, and $\lambda$ exists in the region that contains $\alpha_i = a_{i,i}/b_{i,i}$. 
Consequently, $\lambda$ lies outside the circle of Apollonius when $\beta_i > 1$, and inside it 
when $\beta_i < 1$. When $\beta_i = 1$, the Apollonius circle is the perpendicular bisector of 
the line that connects $\alpha_i$ and 0, dividing the complex plane into halves. 
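
For completeness, the center and radius of the Apollonius circle can be obtained by squaring $|\lambda - \alpha_i| = \beta_i|\lambda|$ and completing the square. The short derivation below is a sketch added for the reader, valid for $\beta_i \neq 1$:

```latex
\begin{align*}
|\lambda-\alpha_i|^2 = \beta_i^2|\lambda|^2
&\iff (1-\beta_i^2)|\lambda|^2 - 2\,\mathrm{Re}(\bar\alpha_i\lambda) + |\alpha_i|^2 = 0 \\
&\iff \Bigl|\lambda - \frac{\alpha_i}{1-\beta_i^2}\Bigr|^2
   = \frac{|\alpha_i|^2}{(1-\beta_i^2)^2} - \frac{|\alpha_i|^2}{1-\beta_i^2}
   = \frac{|\alpha_i|^2\beta_i^2}{(1-\beta_i^2)^2}.
\end{align*}
```

Taking square roots gives the center $\alpha_i/(1-\beta_i^2)$ and the radius $|\alpha_i|\beta_i/|1-\beta_i^2|$.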
The above arguments motivate the following definition. 

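Definition 2.2. For $n \times n$ complex matrices A and B, denote by $S^B$ (and $S^A$) 
the set of $i \in \{1, 2, \dots, n\}$ such that the ith row of B (A) is strictly diagonally 
dominant. 

For $i \in S^B$, define the disk $\Gamma_i^B(A,B)$ by 

(2.11)  $\Gamma_i^B(A,B) \equiv \Bigl\{ z \in \mathbb{C} : \Bigl| z - \dfrac{a_{i,i}}{b_{i,i}} \Bigr| \le \rho_i \Bigr\}$  $(i = 1, 2, \dots, n),$

where denoting $r_i = \sum_{j\neq i} |b_{i,j}|/|b_{i,i}|$ $(< 1)$ and $R_i = \sum_{j\neq i} |a_{i,j}|$, the radii $\rho_i$ are defined by 

$\rho_i = \dfrac{|a_{i,i}| r_i + R_i}{|b_{i,i}|(1-r_i)}.$

For $i \notin S^B$, we set $\Gamma_i^B(A,B) = \mathbb{C}$, the whole complex plane. 

We also define $\Gamma_i^A(A,B)$ by the following. For $i \in S^A$, denote $r_i^A = \sum_{j\neq i} |a_{i,j}|/|a_{i,i}|$ $(< 1)$ 
and $R_i^A = \sum_{j\neq i} |b_{i,j}|$. 

If $b_{i,i} = R_i^A = 0$, define $\Gamma_i^A(A,B) = \{\infty\}$, the point $z = \infty$. If $b_{i,i} = 0$ and 
$R_i^A > 0$, define $\Gamma_i^A(A,B) \equiv \bigl\{ z \in \mathbb{C} : |z| \ge \frac{|a_{i,i}|}{R_i^A}(1-r_i^A) \bigr\}$. 

For $b_{i,i} \neq 0$, denoting $\alpha_i = \dfrac{a_{i,i}}{b_{i,i}}$ and $\beta_i = \dfrac{|b_{i,i}| r_i^A + R_i^A}{|b_{i,i}|(1-r_i^A)}$, 

• If $\beta_i < 1$, then define 

(2.12)  $\Gamma_i^A(A,B) \equiv \{ z \in \mathbb{C} : |z - c_i| \le \rho_i^A \},$

where $c_i = \dfrac{\alpha_i}{1-\beta_i^2}$ and $\rho_i^A = \dfrac{|\alpha_i|\beta_i}{|1-\beta_i^2|}$. 

• If $\beta_i > 1$, then define 

(2.13)  $\Gamma_i^A(A,B) \equiv \{ z \in \mathbb{C} : |z - c_i| \ge \rho_i^A \}.$

• If $\beta_i = 1$, then define 

(2.14)  $\Gamma_i^A(A,B) \equiv \{ z \in \mathbb{C} : |z - \alpha_i| \le |z| \}.$

Finally for $i \notin S^A$, we set $\Gamma_i^A(A,B) = \mathbb{C}$. 

Note that $\Gamma_i^A(A,B)$ in (2.13) and (2.14) contains the point $\{\infty\}$. 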
We now present our eigenvalue localization theorem. 

Theorem 2.3 (Gerschgorin-type theorem for generalized eigenvalue problems). 
Let A, B be $n \times n$ complex matrices. 

All the eigenvalues of the pencil $A - \lambda B$ lie in the union of n regions $\Gamma_i(A,B)$ 
in the complex plane defined by 

(2.15)  $\Gamma_i(A,B) \equiv \Gamma_i^B(A,B) \cap \Gamma_i^A(A,B).$

In other words, if $\lambda$ is an eigenvalue of the pencil, then 

$\lambda \in \Gamma(A,B) \equiv \bigcup_{1\le i\le n} \Gamma_i(A,B).$

Proof. First consider the case where $\lambda$ is a finite eigenvalue, so that $Ax = \lambda Bx$. 
The above arguments show that $\lambda \in \Gamma_i(A,B)$ for i such that $|x_i| = \max_j |x_j|$. 

Similarly, in the infinite eigenvalue case $\lambda = \infty$, let $Bx = 0$. Note that the 
ith row (such that $|x_i| = \max_j |x_j|$) of B cannot be strictly diagonally dominant, 
because if it is, then 

$\Bigl| b_{i,i}x_i + \sum_{j\neq i} b_{i,j}x_j \Bigr| \ge |b_{i,i}||x_i| - \sum_{j\neq i} |b_{i,j}||x_j| \ge |x_i|\Bigl( |b_{i,i}| - \sum_{j\neq i} |b_{i,j}| \Bigr) > 0,$

which contradicts $Bx = 0$. Therefore, $\Gamma_i^B(A,B) = \mathbb{C}$, so $\Gamma_i(A,B) = \Gamma_i^A(A,B)$. Here if 
$i \notin S^A$, then $\Gamma_i(A,B) = \mathbb{C}$, so $\lambda \in \Gamma(A,B)$ is trivial. Therefore we consider the 
case $i \in S^A$. Note that the fact that the ith row of B is not strictly diagonally dominant implies 
$|b_{i,i}| \le R_i^A$, which in turn means $\beta_i > 1$, because recalling that $\beta_i = \dfrac{|b_{i,i}| r_i^A + R_i^A}{|b_{i,i}|(1-r_i^A)}$, 
we have 

$|b_{i,i}| r_i^A + R_i^A - |b_{i,i}|(1-r_i^A) = |b_{i,i}|(2r_i^A - 1) + R_i^A \ge 2r_i^A |b_{i,i}| > 0.$

Hence, recalling (2.13) we see that $\infty \in \Gamma_i^A(A,B)$. 

Therefore, any eigenvalue of the pencil lies in $\Gamma_i(A,B)$ for some i, so all the 
eigenvalues lie in the union $\bigcup_{1\le i\le n} \Gamma_i(A,B)$. □ 



Theorem 2.3 shares the properties with the standard Gerschgorin theorem that 
it is an eigenvalue inclusion set that is easy to compute, and the boundaries are 
defined as circles (except for $\Gamma_i^A(A,B)$ for the special case $\beta_i = 1$). One difference 
between the two is that Theorem 2.3 involves n + m circles, where m is the number 
of rows for which both A and B are strictly diagonally dominant. By contrast, 
the standard Gerschgorin theorem always needs n circles. Also, when $B \to I$, the set does 
not become the standard Gerschgorin set, but rather becomes a slightly tighter set 
(owing to $\Gamma_i^A(A,B)$). Although these are not serious defects of our set $\Gamma(A,B)$, the 
following simplified variant solves the two issues. 

Definition 2.4. We use the notations in Definition 2.2. For $i \in S^B$, define 
$\Gamma_i^S(A,B)$ by $\Gamma_i^S(A,B) = \Gamma_i^B(A,B)$. For $i \notin S^B$, define $\Gamma_i^S(A,B) = \Gamma_i^A(A,B)$. 

Corollary 2.5. Let A, B be $n \times n$ complex matrices. All the eigenvalues of the 
pencil $A - \lambda B$ lie in $\Gamma^S(A,B) = \bigcup_{1\le i\le n} \Gamma_i^S(A,B)$. 

Proof. It is easy to see that $\Gamma_i(A,B) \subseteq \Gamma_i^S(A,B)$ for all i. Using Theorem 2.3 the 
conclusion follows immediately. □ 

As a special case, this result becomes a union of n disks when B is strictly 
diagonally dominant. 



Corollary 2.6. Let A, B be $n \times n$ complex matrices, and let B be strictly diagonally 
dominant. Then $\Gamma_i^S(A,B) = \Gamma_i^B(A,B)$, and denoting by $\lambda_1, \dots, \lambda_n$ the n finite 
eigenvalues of the pencil $A - \lambda B$, 

$\lambda_k \in \Gamma^S(A,B) = \bigcup_{1\le i\le n} \Gamma_i^B(A,B), \qquad k = 1, \dots, n.$

Proof. The fact that $\Gamma_i^S(A,B) = \Gamma_i^B(A,B)$ follows immediately from the diagonal 
dominance of B. The diagonal dominance of B also forces it to be nonsingular, so 
that the pencil $A - \lambda B$ has n finite eigenvalues. □ 

Several points are worth noting regarding the above results. 

• $\Gamma^S(A,B)$ in Corollaries 2.5 and 2.6 is defined by n circles. Moreover, it is 
easy to see that $\Gamma^S(A,B)$ reduces to the original Gerschgorin theorem by 
letting B = I. In this respect $\Gamma^S(A,B)$ might be considered a more 
natural generalization of the standard Gerschgorin theorem than $\Gamma(A,B)$. We 
note that these properties are shared by K(A, B) in (1.3) but not shared 
by G(A, B) in (1.1), which is defined by n regions, but not circles in the 
Euclidean metric, and is not equivalent to (always worse, see below) the 
standard Gerschgorin set when B = I. $\Gamma(A,B)$ also shares with K(A, B) 
the property that it is a compact set in $\mathbb{C}$ if and only if B is strictly 
diagonally dominant, as mentioned in Theorem 8 in [4]. 

• K(A, B) is always included in $\Gamma(A,B)$. To see this, suppose that $z \in K_i(A,B)$, 
so $|b_{i,i}z - a_{i,i}| \le \sum_{j\neq i} |b_{i,j}z - a_{i,j}|$. Then for $b_{i,i} \neq 0$ (note that 
$\Gamma_i^B(A,B) = \mathbb{C}$, so trivially $z \in \Gamma_i^B(A,B)$, if $b_{i,i} = 0$) 

$|b_{i,i}z - a_{i,i}| \le \sum_{j\neq i} |b_{i,j}||z| + \sum_{j\neq i} |a_{i,j}| \iff \Bigl| z - \dfrac{a_{i,i}}{b_{i,i}} \Bigr| \le r_i|z| + \dfrac{R_i}{|b_{i,i}|}$  (recall $r_i = \sum_{j\neq i} |b_{i,j}|/|b_{i,i}|$). 

Since we can write $|z - a_{i,i}/b_{i,i}| - r_i|z| = |z - a_{i,i}/b_{i,i} + r_i e^{i\theta} z|$ for some $\theta \in [0, 2\pi]$, 
it follows that if $z \in K_i(A,B)$ then 

$\Bigl| z(1 + r_i e^{i\theta}) - \dfrac{a_{i,i}}{b_{i,i}} \Bigr| \le \dfrac{R_i}{|b_{i,i}|}.$

Since $r_i < 1$, we can divide this by $(1 + r_i e^{i\theta})$, which yields 

(2.16)  $\Bigl| z - \dfrac{a_{i,i}}{b_{i,i}} \cdot \dfrac{1}{1 + r_i e^{i\theta}} \Bigr| \le \dfrac{R_i}{|b_{i,i}|} \cdot \dfrac{1}{|1 + r_i e^{i\theta}|}.$

Note that this becomes (2.4) if we substitute $\gamma_i$ into $r_i e^{i\theta}$ and $\lambda$ into z. Now, 
since $\Gamma_i^B(A,B)$ is derived from (2.4) by considering a disk that contains $\lambda$ 
that satisfies (2.4) for any $\gamma_i$ such that $|\gamma_i| \le r_i$, it follows that z that 
satisfies (2.16) is included in $\Gamma_i^B(A,B)$. By a similar argument we can 
prove $z \in K_i(A,B) \Rightarrow z \in \Gamma_i^A(A,B)$, so the claim is proved. 

• Although K(A, B) is always sharper than $\Gamma(A,B)$ is, $\Gamma(A,B)$ has the 
obvious advantage over K(A, B) in its practicality. $\Gamma(A,B)$ is much easier to 
compute than K(A, B), which is generally a union of complicated regions. 
It is also easy to see that $\Gamma(A,B)$ approaches K(A, B) as B approaches a 
diagonal matrix, see examples in section 3. $\Gamma(A,B)$ sacrifices some 
tightness for the sake of simplicity. For instance, K(A, B) is difficult to use for 
the analysis in section 4. 

• G(A, B) and $\Gamma(A,B)$ are generally not comparable, see the examples in 
section 3. However, we can see that $\Gamma_i(A,B)$ is a nontrivial set in the 
complex plane $\mathbb{C}$ whenever $G_i(A,B)$ is, but the contrary does not hold. 
This can be verified by the following. Suppose $G_i(A,B)$ is a nontrivial set 
in $\mathbb{C}$, which means $\bigl(\sum_{j\neq i} |a_{i,j}|\bigr)^2 + \bigl(\sum_{j\neq i} |b_{i,j}|\bigr)^2 < |a_{i,i}|^2 + |b_{i,i}|^2$. This 
is true only if $\sum_{j\neq i} |a_{i,j}| < |a_{i,i}|$ or $\sum_{j\neq i} |b_{i,j}| < |b_{i,i}|$, so the ith row of 
at least one of A and B has to be strictly diagonally dominant. Hence, 
$\Gamma_i(A,B)$ is a nontrivial subset of $\mathbb{C}$. 

To see the contrary is not true, consider the pencil 

(2.17)  $A_1 - \lambda B_1 = \begin{pmatrix} 2 & 3 \\ 3 & 2 \end{pmatrix} - \lambda \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix},$

which has eigenvalues $-1$ and $5/3$. $\Gamma(A_1,B_1)$ for this pencil is 
$\Gamma(A_1,B_1) = \{ z \in \mathbb{C} : |z - 1| \le 4 \}$. In contrast, $G(A_1,B_1)$ is 
$G(A_1,B_1) = \{ z \in \mathbb{C} : \chi(z, 1) \le \sqrt{10/8} \}$, 
which is useless because the chordal radius is larger than 1. 

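• When $B \approx I$, $\Gamma(A,B)$ is always a tighter region than G(A, B) is, because 
$G_i(A,I)$ is 

$\dfrac{|\lambda - a_{i,i}|}{\sqrt{1+|\lambda|^2}\,\sqrt{1+|a_{i,i}|^2}} \le \sqrt{\dfrac{R_i^2}{1+|a_{i,i}|^2}} \iff |\lambda - a_{i,i}| \le \sqrt{1+|\lambda|^2}\, R_i,$

whereas $\Gamma_i^S(A,I)$ is the standard Gerschgorin set 

$|\lambda - a_{i,i}| \le R_i,$

from which $\Gamma_i^S(A,I) \subseteq G_i(A,I)$ follows trivially. 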
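
The disks of Definition 2.2 are cheap to evaluate. The following Python sketch computes the disks $\Gamma_i^B$ for the 2x2 pencil (2.17), with $A_1 = [[2,3],[3,2]]$ and $B_1 = [[2,1],[1,2]]$, and checks that both eigenvalues lie inside. The helper names `gamma_b_disk` and `pencil_eigs_2x2` are illustrative; the eigenvalues are obtained from the quadratic $\det(A - \lambda B) = 0$, which is only meant for this small example.

```python
import cmath

def gamma_b_disk(A, B, i):
    """Return (center, radius) of the i-th disk (2.11), or None when row i of B
    is not strictly diagonally dominant (the region is then the whole plane)."""
    n = len(A)
    off_b = sum(abs(B[i][j]) for j in range(n) if j != i)
    if abs(B[i][i]) <= off_b:
        return None
    r = off_b / abs(B[i][i])
    R = sum(abs(A[i][j]) for j in range(n) if j != i)
    radius = (abs(A[i][i]) * r + R) / (abs(B[i][i]) * (1 - r))
    return A[i][i] / B[i][i], radius

def pencil_eigs_2x2(A, B):
    """Roots of det(A - lam*B) = 0 for a 2x2 pencil with det(B) != 0."""
    qa = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    qb = -(A[0][0] * B[1][1] + A[1][1] * B[0][0]
           - A[0][1] * B[1][0] - A[1][0] * B[0][1])
    qc = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = cmath.sqrt(qb * qb - 4 * qa * qc)
    return [(-qb + disc) / (2 * qa), (-qb - disc) / (2 * qa)]

A1 = [[2, 3], [3, 2]]
B1 = [[2, 1], [1, 2]]
disks = [gamma_b_disk(A1, B1, i) for i in range(2)]
eigs = pencil_eigs_2x2(A1, B1)
```

Both rows give the same disk, centered at 1 with radius 4, so the union is a single disk, as stated after (2.17).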



2.3. A tighter result. Here we show that we can obtain a slightly tighter 
eigenvalue inclusion set by bounding the center of the disk (2.6) more carefully. Instead 
of Lemma 2.1 we use the following two results. 

Lemma 2.7. The point $1/(1 + re^{i\theta})$ where $r > 0$ and $\theta \in [0, 2\pi]$ lies on a circle of 
radius $r/(1 - r^2)$ centered at $1/(1 - r^2)$. 

Proof. 

$\Bigl| \dfrac{1}{1 + re^{i\theta}} - \dfrac{1}{1 - r^2} \Bigr| = \Bigl| \dfrac{(1 - r^2) - (1 + re^{i\theta})}{(1 + re^{i\theta})(1 - r^2)} \Bigr| = \Bigl| \dfrac{r(r + e^{i\theta})}{(1 + re^{i\theta})(1 - r^2)} \Bigr| = \Bigl| \dfrac{re^{i\theta}(1 + re^{-i\theta})}{(1 + re^{i\theta})(1 - r^2)} \Bigr| = \dfrac{r}{1 - r^2}$ 

(since $|1 + re^{-i\theta}| = |1 + re^{i\theta}|$). □ 



Lemma 2.8. Denote by M(r) the disk of radius $r/(1 - r^2)$ centered at $1/(1 - r^2)$. 
If $0 < r' < r < 1$ then $M(r') \subseteq M(r)$. 

Proof. We prove by showing that $z \in M(r') \Rightarrow z \in M(r)$. Suppose $z \in M(r')$. Then z 
satisfies $\Bigl| z - \dfrac{1}{1 - (r')^2} \Bigr| \le \dfrac{r'}{1 - (r')^2}$, so 

$\Bigl| z - \dfrac{1}{1 - r^2} \Bigr| \le \Bigl| z - \dfrac{1}{1 - (r')^2} \Bigr| + \Bigl| \dfrac{1}{1 - (r')^2} - \dfrac{1}{1 - r^2} \Bigr| \le \dfrac{r'}{1 - (r')^2} + \dfrac{r^2 - (r')^2}{(1 - (r')^2)(1 - r^2)} = \dfrac{r'(1 - r^2) + r^2 - (r')^2}{(1 - (r')^2)(1 - r^2)}.$

Here, the right-hand side is smaller than $r/(1 - r^2)$, because 

$\dfrac{r}{1 - r^2} - \dfrac{r'(1 - r^2) + r^2 - (r')^2}{(1 - (r')^2)(1 - r^2)} = \dfrac{r(1 - (r')^2) - (r'(1 - r^2) + r^2 - (r')^2)}{(1 - (r')^2)(1 - r^2)} = \dfrac{(1 - r)(1 - r')(r - r')}{(1 - (r')^2)(1 - r^2)} > 0.$

Hence $\Bigl| z - \dfrac{1}{1 - r^2} \Bigr| < \dfrac{r}{1 - r^2}$, so $z \in M(r)$. Since the above argument holds for any 
$z \in M(r')$, $M(r') \subseteq M(r)$ is proved. □ 
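
The two lemmas are easy to confirm numerically. The sketch below, with an arbitrary illustrative value r = 0.6 and a hypothetical helper `M`, checks that the points $1/(1 + re^{i\theta})$ lie on the boundary circle of M(r) and that the corresponding points for smaller moduli stay inside M(r).

```python
import cmath
import math

def M(r: float):
    """Center and radius of the disk M(r) from Lemma 2.8."""
    return 1 / (1 - r * r), r / (1 - r * r)

r = 0.6  # illustrative value with 0 < r < 1
center, radius = M(r)

# Lemma 2.7: distance from the center of M(r) to 1/(1 + r e^{i theta}).
on_circle = [abs(1 / (1 + r * cmath.exp(2j * math.pi * k / 360)) - center)
             for k in range(360)]

# Lemmas 2.7 and 2.8 together: the same distance for moduli r' <= r.
inside = [abs(1 / (1 + rp * cmath.exp(2j * math.pi * k / 360)) - center)
          for rp in (0.0, 0.1, 0.3, 0.5, 0.6) for k in range(360)]
```

Every entry of `on_circle` equals `radius` up to rounding, and no entry of `inside` exceeds it.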

The implication of these two Lemmas applied to (2.6) is that the center $s/(1+\gamma)$ 
lies in $sM(r)$. Therefore we conclude that z that satisfies (2.6) is included in the 
disk centered at $\dfrac{s}{1 - r^2}$, and of radius $\dfrac{|s| r}{1 - r^2} + \dfrac{t}{1 - r}$. 

Therefore, it follows that $\lambda$ that satisfies (2.4) lies in the disk of radius 

$\dfrac{|a_{i,i}| r_i + R_i(1 + r_i)}{|b_{i,i}|(1 - r_i^2)},$

centered at $\dfrac{a_{i,i}}{b_{i,i}(1 - r_i^2)}$. 

Similarly, we can conclude that $1/\lambda$ that satisfies (2.5) has to satisfy 

(2.18)  $\Bigl| \dfrac{1}{\lambda} - \dfrac{b_{i,i}}{a_{i,i}(1 - (r_i^A)^2)} \Bigr| \le \dfrac{|b_{i,i}| r_i^A + R_i^A(1 + r_i^A)}{|a_{i,i}|(1 - (r_i^A)^2)}.$

Recalling the analysis that derives (2.10), we see that when $b_{i,i} \neq 0$, this inequality 
is equivalent to 

(2.19)  $|\lambda - \tilde\alpha_i| \le \tilde\beta_i |\lambda|,$

where $\tilde\alpha_i = a_{i,i}(1 - (r_i^A)^2)/b_{i,i}$ and $\tilde\beta_i = r_i^A + R_i^A(1 + r_i^A)/|b_{i,i}|$. 

The equality of (2.19) holds on an Apollonius circle, whose radius is 
$\tilde\rho_i^A = \dfrac{|\tilde\alpha_i|\tilde\beta_i}{|1 - \tilde\beta_i^2|}$, and center is $\tilde c_i = \dfrac{\tilde\alpha_i}{1 - \tilde\beta_i^2}$. 

The above analyses lead to the following definition, analogous to that in 
Definition 2.2. 

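Definition 2.9. We use the same notations $S^B, S^A, r_i, R_i, r_i^A, R_i^A$ as in Definition 2.2. 

For $i \in S^B$, define the disk $\tilde\Gamma_i^B$ by 

(2.20)  $\tilde\Gamma_i^B(A,B) \equiv \Bigl\{ z \in \mathbb{C} : \Bigl| z - \dfrac{a_{i,i}}{b_{i,i}(1 - r_i^2)} \Bigr| \le \tilde\rho_i \Bigr\}$  $(i = 1, 2, \dots, n),$

where the radii $\tilde\rho_i$ are defined by 

$\tilde\rho_i = \dfrac{|a_{i,i}| r_i + R_i(1 + r_i)}{|b_{i,i}|(1 - r_i^2)}.$

For $i \notin S^B$, we set $\tilde\Gamma_i^B(A,B) = \mathbb{C}$. 

$\tilde\Gamma_i^A(A,B)$ is defined by the following. For $i \in S^A$ and $b_{i,i} \neq 0$, denote 

$\tilde\alpha_i = \dfrac{a_{i,i}}{b_{i,i}}\bigl(1 - (r_i^A)^2\bigr)$, $\tilde\beta_i = r_i^A + \dfrac{R_i^A(1 + r_i^A)}{|b_{i,i}|}$, $\tilde c_i = \dfrac{\tilde\alpha_i}{1 - \tilde\beta_i^2}$ and $\tilde\rho_i^A = \dfrac{|\tilde\alpha_i|\tilde\beta_i}{|1 - \tilde\beta_i^2|}$. 

Then, $\tilde\Gamma_i^A(A,B)$ is defined similarly to $\Gamma_i^A(A,B)$ (by replacing $\alpha_i, \beta_i, c_i, \rho_i^A$ with 
$\tilde\alpha_i, \tilde\beta_i, \tilde c_i, \tilde\rho_i^A$ respectively in (2.12)-(2.14)), depending on whether $\tilde\beta_i > 1$, $\tilde\beta_i < 1$ or 
$\tilde\beta_i = 1$. 

When $b_{i,i} = 0$ or $i \notin S^A$, $\tilde\Gamma_i^A(A,B) = \Gamma_i^A(A,B)$ defined in Definition 2.2. 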

Thus we arrive at a slightly tighter Gerschgorin theorem. 

Theorem 2.10 (Tighter Gerschgorin-type theorem). Let A, B be $n \times n$ complex 
matrices. 

All the eigenvalues of the pencil $A - \lambda B$ lie in the union of n regions $\tilde\Gamma_i(A,B)$ 
in the complex plane defined by 

(2.21)  $\tilde\Gamma_i(A,B) \equiv \tilde\Gamma_i^B(A,B) \cap \tilde\Gamma_i^A(A,B).$

In other words, if $\lambda$ is an eigenvalue of the pencil, then 

$\lambda \in \tilde\Gamma(A,B) \equiv \bigcup_{1\le i\le n} \tilde\Gamma_i(A,B).$

The proof is the same as the one for Theorem 2.3 and is omitted. The simplified 
results of Theorem 2.10 analogous to Corollaries 2.5 and 2.6 can also be derived 
but are omitted. 

It is easy to see that $\tilde\Gamma_i(A,B) \subseteq \Gamma_i(A,B)$ for all i, so $\tilde\Gamma(A,B)$ is a sharper 
eigenvalue bound than $\Gamma(A,B)$. For example, for the pencil (2.17), we have $\tilde\Gamma_i(A_1,B_1) = 
\{ z \in \mathbb{C} : |z - \frac{4}{3}| \le \frac{11}{3} \}$. We can also see that $\tilde\Gamma(A,B)$ shares all the properties 
mentioned at the end of section 2.2. The reason we presented $\Gamma(A,B)$ although $\tilde\Gamma(A,B)$ 
is always tighter is that $\Gamma(A,B)$ has centers $a_{i,i}/b_{i,i}$, which may make it simpler to 
apply than $\tilde\Gamma(A,B)$. In fact, in the analysis in section 4 we only use Theorem 2.3. 

2.4. Localizing a specific number of eigenvalues. We are sometimes interested 
not only in the eigenvalue bounds, but also in the number of eigenvalues included 
in a certain region. The classical Gerschgorin theorem serves this need [11], which 
has the property that if a region contains exactly k Gerschgorin disks and is disjoint 
from the other disks, then it contains exactly k eigenvalues. This fact is used to 
derive a perturbation result for simple eigenvalues in [J3]. An analogous result 
holds for the set G(A, B) [10, Ch.5]. Here we show that our Gerschgorin set also 
possesses the same property. 

Theorem 2.11. If a union of k Gerschgorin regions $\Gamma_i(A,B)$ (or $\tilde\Gamma_i(A,B)$) in 
the above Theorems (Theorem 2.3, 2.10 or Corollary 2.5, 2.6) is disjoint from the 
remaining $n - k$ regions and is not the entire complex plane $\mathbb{C}$, then exactly k 
eigenvalues of the pencil $A - \lambda B$ lie in the union. 

Proof. We prove the result for $\Gamma_i(A,B)$. The other sets can be treated in an entirely 
identical way. 

We use the same trick used for proving the analogous result for the set G(A, B), 
shown in [10, Ch.5]. Let $\tilde A = \mathrm{diag}(a_{11}, a_{22}, \dots, a_{nn})$, $\tilde B = \mathrm{diag}(b_{11}, b_{22}, \dots, b_{nn})$ 
and define 

$A(t) = \tilde A + t(A - \tilde A), \qquad B(t) = \tilde B + t(B - \tilde B).$

It is easy to see that the Gerschgorin disks $\Gamma_i(A(t), B(t))$ get enlarged as t increases 
from 0 to 1. 

In [10] it is shown in the chordal metric that the eigenvalues of a pencil 
$A - \lambda B$ are continuous functions of its elements provided that the pencil is regular. 

Note that each of the regions $\Gamma_i(A(t), B(t))$ is a closed and bounded subset of $\mathbb{C}$ 
in the chordal metric, and that if a union of k regions $\Gamma_i(A(t), B(t))$ is disjoint from 
the other $n - k$ regions in the Euclidean metric, then this disjointness holds also in 
the chordal metric. Therefore, if the pencil $A(t) - \lambda B(t)$ is regular for $0 \le t \le 1$, then 
an eigenvalue that is included in a certain union of k disks $\bigcup_{1\le i\le k} \Gamma_i(A(t), B(t))$ 
cannot jump to another disjoint region as t increases, so the claim is proved. Hence 
it suffices to prove that the pencil $A(t) - \lambda B(t)$ is regular. 

The regularity is proved by contradiction. If $A(t) - \lambda B(t)$ is singular for some 
$0 \le t \le 1$, then any point $z \in \mathbb{C}$ is an eigenvalue of the pencil. However, the 
disjointness assumption implies that there must exist a point $z' \in \mathbb{C}$ such that $z'$ 
lies in none of the Gerschgorin disks, so $z'$ cannot be an eigenvalue. Therefore, 
$A(t) - \lambda B(t)$ is necessarily regular for $0 \le t \le 1$. 

□ 
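
When all regions are disks (for instance $\Gamma^S(A,B)$ with B strictly diagonally dominant), applying Theorem 2.11 amounts to grouping the disks into clusters of mutually overlapping disks; each cluster of k disks then contains exactly k eigenvalues. The following Python sketch performs this grouping with a small union-find structure. The function name `disk_components` and the sample disks are hypothetical.

```python
def disk_components(disks):
    """Group closed disks (center, radius) into connected clusters of
    overlapping disks; returns a sorted list of index lists."""
    n = len(disks)
    parent = list(range(n))

    def find(k):
        # find the representative of k, compressing the path on the way
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    for p in range(n):
        for q in range(p + 1, n):
            (cp, rp), (cq, rq) = disks[p], disks[q]
            if abs(cp - cq) <= rp + rq:   # the two closed disks intersect
                parent[find(p)] = find(q)

    groups = {}
    for k in range(n):
        groups.setdefault(find(k), []).append(k)
    return sorted(groups.values())

# Hypothetical disks: two overlapping near the origin, one isolated near 10.
disks = [(0.0, 1.0), (1.5, 1.0), (10.0, 0.5)]
clusters = disk_components(disks)
```

Here the first two disks form one cluster and the third is isolated, so under the hypotheses of the theorem two eigenvalues lie in the first cluster and one in the isolated disk.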



3. Examples 

Here we show some examples to illustrate the regions we discussed above. As 
test matrices we consider the simple pencil $A - \lambda B \in \mathbb{C}^{n\times n}$ where 

(3.1)  $A = \begin{pmatrix} 4 & a & & \\ a & 4 & \ddots & \\ & \ddots & \ddots & a \\ & & a & 4 \end{pmatrix}$ and $B = \begin{pmatrix} 4 & b & & \\ b & 4 & \ddots & \\ & \ddots & \ddots & b \\ & & b & 4 \end{pmatrix}.$

Note that $\Gamma(A,B)$, $\tilde\Gamma(A,B)$ and K(A, B) are nontrivial regions if $b < 2$, and G(A, B) 
is nontrivial only if $a^2 + b^2 < 8$. Figure 1 shows our results $\Gamma(A,B)$ and $\tilde\Gamma(A,B)$ for 
different parameters (a, b). The two crossed points indicate the smallest and largest 
eigenvalues of the pencil (3.1) when the matrix size is n = 100. For (a, b) = (1, 2) 
the largest eigenvalue (not shown) was $\approx 1034$. Note that $\Gamma(A,B) = \Gamma^B(A,B)$ 
when (a, b) = (2, 1) and $\Gamma(A,B) = \Gamma^A(A,B)$ when (a, b) = (1, 2). 

Figure 1. Plots of $\Gamma(A,B)$ and $\tilde\Gamma(A,B)$ for matrices (3.1) with 
different a, b. 
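
The extreme eigenvalues quoted above can be reproduced without an eigensolver, assuming (as the layout of (3.1) suggests) that A and B are tridiagonal Toeplitz matrices with 4 on the diagonal and a, respectively b, on the two neighbouring diagonals. Under that assumption both matrices share the eigenvectors of tridiag(1, 0, 1), whose eigenvalues are $2\cos(k\pi/(n+1))$. The function name below is an illustrative choice.

```python
import math

def toeplitz_pencil_eigs(a: float, b: float, n: int):
    """Eigenvalues of A - lam*B for A = tridiag(a, 4, a), B = tridiag(b, 4, b).

    Assumes the tridiagonal Toeplitz reading of (3.1); both matrices are then
    polynomials in tridiag(1, 0, 1), whose eigenvalues are 2*cos(k*pi/(n+1))."""
    eigs = []
    for k in range(1, n + 1):
        c = 2 * math.cos(k * math.pi / (n + 1))
        eigs.append((4 + a * c) / (4 + b * c))
    return eigs

eigs_12 = toeplitz_pencil_eigs(1, 2, 100)
largest = max(eigs_12)

# For (a, b) = (2, 1) every row of B is strictly diagonally dominant; an interior
# row gives r_i = 1/2, R_i = 4, hence a disk of radius 3 centered at 1.
eigs_21 = toeplitz_pencil_eigs(2, 1, 100)
```

With n = 100 and (a, b) = (1, 2) the largest value comes out close to 1034, matching the figure quoted in the text.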

The purpose of the figures below is to compare our results with the known results 
G(A, B) and K(A, B). As for our results we only show $\Gamma^S(A,B)$ for simplicity. 
Figure 2 compares $\Gamma^S(A,B)$ with G(A, B). We observe that in the cases (a, b) = 
(2, 1), (3, 1), $\Gamma(A,B)$ is a much more useful set than G(A, B) is, which in the latter 
case is the whole complex plane. This reflects the observation given in section 2.2 
that $\Gamma(A,B)$ is always tighter when $B \approx I$. 

Figure 2. Plots of $\Gamma^S(A,B)$ and G(A, B) for matrices (3.1) with 
different a, b. 

Figure 3 compares $\Gamma^S(A,B)$ with K(A, B), in which the boundary of $\Gamma^S(A,B)$ 
is shown as dashed circles. We verify the relation $K(A,B) \subseteq \tilde\Gamma(A,B) \subseteq \Gamma(A,B)$. 
These three sets become equivalent when B is nearly diagonal, as shown in the 
middle graph. The right graph shows the regions for the matrix defined in Example 
1 in [4], in which all the eigenvalues are shown as crossed points. 

Figure 3. Plots of $\Gamma(A,B)$ and K(A, B). 



We emphasize that our result $\Gamma(A,B)$ is defined by circles and so is easy to plot, 
while the regions K(A, B) and G(A, B) are generally complicated regions, and are 
difficult to plot. In the above figures we obtained K(A, B) and G(A, B) by a very 
naive method, i.e., by dividing the complex plane into small regions and testing 
whether the center of the region is contained in each set. 
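
The naive grid procedure can be sketched as follows for K(A, B), using the membership test $|b_{i,i}z - a_{i,i}| \le \sum_{j\neq i} |b_{i,j}z - a_{i,j}|$. This is only an illustration of the idea, not the code used for the figures; the function names, the tolerance `tol`, the grid limits and the 2x2 test pencil are assumptions.

```python
def in_K(z, A, B, tol=1e-12):
    """True if z lies in some K_i(A, B); tol absorbs rounding on the boundary."""
    n = len(A)
    for i in range(n):
        lhs = abs(B[i][i] * z - A[i][i])
        rhs = sum(abs(B[i][j] * z - A[i][j]) for j in range(n) if j != i)
        if lhs <= rhs + tol:
            return True
    return False

def grid_points_in_K(A, B, xlim, ylim, step):
    """Centers of the square grid cells that belong to K(A, B)."""
    nx = int(round((xlim[1] - xlim[0]) / step))
    ny = int(round((ylim[1] - ylim[0]) / step))
    hits = []
    for p in range(nx):
        for q in range(ny):
            z = complex(xlim[0] + (p + 0.5) * step, ylim[0] + (q + 0.5) * step)
            if in_K(z, A, B):
                hits.append(z)
    return hits

A1 = [[2, 3], [3, 2]]
B1 = [[2, 1], [1, 2]]
hits = grid_points_in_K(A1, B1, (-4.0, 6.0), (-5.0, 5.0), 0.1)
```

The cost grows with the number of grid cells times $n^2$, whereas the circle-based sets need only the n centers and radii.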



4. Application to forward error analysis 



The Gerschgorin theorems presented in section 2 can be used in a straightforward 
way for a matrix pencil with some diagonal dominance property whenever one wants 
a simple estimate for the eigenvalues or bounds for the extremal eigenvalues, as the 
standard Gerschgorin theorem is used for standard eigenvalue problems. 

Here we show how our results can also be used to provide a forward error analysis 
for computed eigenvalues of a diagonalizable pencil $A - \lambda B \in \mathbb{C}^{n\times n}$. 

For simplicity we assume only finite eigenvalues exist. After the computation of 
eigenvalues $\hat\lambda_i$ $(1 \le i \le n)$ and eigenvectors (both left and right) one can normalize 
the eigenvectors to get $X, Y \in \mathbb{C}^{n\times n}$ such that 

$Y^H A X\ (\equiv \hat A) = \begin{pmatrix} \hat\lambda_1 & e_{1,2} & \cdots & e_{1,n} \\ e_{2,1} & \hat\lambda_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & e_{n-1,n} \\ e_{n,1} & \cdots & e_{n,n-1} & \hat\lambda_n \end{pmatrix} \equiv \mathrm{diag}\{\hat\lambda_1, \dots, \hat\lambda_n\} + E,$

$Y^H B X\ (\equiv \hat B) = \begin{pmatrix} 1 & f_{1,2} & \cdots & f_{1,n} \\ f_{2,1} & 1 & \ddots & \vdots \\ \vdots & \ddots & \ddots & f_{n-1,n} \\ f_{n,1} & \cdots & f_{n,n-1} & 1 \end{pmatrix} \equiv I + F.$

The matrices E and F represent the errors, which we expect to be small after 
a successful computation (note that in practice computing the matrix products 
$Y^H A X$, $Y^H B X$ also introduces errors, but here we ignore this effect, to focus on 
the accuracy of the eigensolver). We denote by $E_j = \sum_l |e_{j,l}|$ and $F_j = \sum_l |f_{j,l}|$ 
$(1 \le j \le n)$ their absolute jth row sums. We assume that $F_j < 1$ for all j, or 
equivalently that $I + F$ is strictly diagonally dominant, so in the following we only 
consider $\Gamma^S(\hat A, \hat B)$ in Corollary 2.5 and refer to it as the Gerschgorin disk. 

We note that the assumption that both eigenvalues and eigenvectors are 
computed restricts the problem size to moderate n. Nonetheless, computation of a 
full eigendecomposition of a small-sized problem is often necessary in practice. For 
example, it is the computational kernel of the Rayleigh-Ritz process in a method 
for computing several eigenpairs of a large-scale problem [1, Ch. 5]. 

Simple bound. For a particular computed eigenvalue $\hat\lambda_i$, we are interested in how 
close it is to an "exact" eigenvalue of the pencil $A - \lambda B$. We consider the simple 
and multiple eigenvalue cases separately. 

(1) When $\hat\lambda_i$ is a simple eigenvalue. We define $\delta \equiv \min_{j\neq i} |\hat\lambda_i - \hat\lambda_j| > 0$. If E 
and F are small enough, then $\Gamma_i(\hat A, \hat B)$ is disjoint from all the other $n - 1$ 
disks. Specifically, this is true if $\delta > \rho_i + \rho_j$ for all $j \neq i$, where 

(4.1)  $\rho_i = \dfrac{|\hat\lambda_i| F_i + E_i}{1 - F_i}, \qquad \rho_j = \dfrac{|\hat\lambda_j| F_j + E_j}{1 - F_j}$

are the radii of the ith and jth Gerschgorin disks in Theorem 2.3 
respectively. If the inequalities are satisfied for all $j \neq i$, then using Theorem 2.11 
we conclude that there exists exactly 1 eigenvalue $\lambda_i$ of the pencil $\hat A - \lambda \hat B$ 
(which has the same eigenvalues as $A - \lambda B$) such that 

(4.2)  $|\lambda_i - \hat\lambda_i| \le \rho_i.$

(2) When $\hat\lambda_i$ is a multiple eigenvalue of multiplicity k, so that $\hat\lambda_i = \hat\lambda_{i+1} = 
\cdots = \hat\lambda_{i+k-1}$. It is straightforward to see that a similar argument holds 
and if the k disks $\Gamma_{i+l}(\hat A, \hat B)$ $(0 \le l \le k - 1)$ are disjoint from the other 
$n - k$ disks, then there exist exactly k eigenvalues $\lambda_j$ $(i \le j \le i + k - 1)$ of 
the pencil $A - \lambda B$ such that 

(4.3)  $|\lambda_j - \hat\lambda_i| \le \max_{0\le l\le k-1} \rho_{i+l}.$
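
For the simple-eigenvalue case, the check and the resulting bound can be written down in a few lines. The following Python sketch takes the computed eigenvalues and the error matrices E and F (with zero diagonals) as plain lists; the function name `simple_bound` and the numerical data are made up for illustration.

```python
def simple_bound(lam_hat, E, F, i):
    """Return the radius rho_i of (4.1) when delta > rho_i + rho_j holds
    for every j != i, and None otherwise."""
    n = len(lam_hat)
    Erow = [sum(abs(v) for v in row) for row in E]   # E_j
    Frow = [sum(abs(v) for v in row) for row in F]   # F_j
    if max(Frow) >= 1:
        raise ValueError("I + F must be strictly diagonally dominant")
    rho = [(abs(lam_hat[j]) * Frow[j] + Erow[j]) / (1 - Frow[j]) for j in range(n)]
    others = [j for j in range(n) if j != i]
    delta = min(abs(lam_hat[i] - lam_hat[j]) for j in others)
    if any(delta <= rho[i] + rho[j] for j in others):
        return None
    return rho[i]

# Made-up data for illustration only.
lam_hat = [1.0, 2.0, 4.0]
E = [[0, 1e-3, 0], [1e-3, 0, 1e-3], [0, 1e-3, 0]]
F = [[0, 2e-3, 0], [2e-3, 0, 2e-3], [0, 2e-3, 0]]
rho_0 = simple_bound(lam_hat, E, F, 0)
```

For this data the first disk is isolated, and the bound (4.2) on the first eigenvalue is about 3.0e-3.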

Tighter bound. Here we derive another bound that can be much tighter than 
(4.2) when the error matrices E and F are small. We use the technique of 
diagonal similarity transformations employed in [12, 10], where first-order eigenvalue 
perturbation results are obtained. 

We consider the case where $\hat\lambda_i$ is a simple eigenvalue and denote $\delta = \min_{j\neq i} |\hat\lambda_i - 
\hat\lambda_j| > 0$, and suppose that the ith Gerschgorin disk of the pencil $\hat A - \lambda \hat B$ is disjoint 
from the others. 

Let T be a diagonal matrix whose ith diagonal is $\tau$ and 1 otherwise. We consider 
the Gerschgorin disks $\Gamma_j(T\hat A T^{-1}, T\hat B T^{-1})$, and find the smallest $\tau$ such that the 
ith disk is disjoint from the others. By the assumption, this disjointness holds when 
$\tau = 1$, so we only consider $\tau < 1$. 

The center of $\Gamma_j(T\hat A T^{-1}, T\hat B T^{-1})$ is $\hat\lambda_j$ for all j. As for the radii $\hat\rho_i$ and $\hat\rho_j$, for 
$\tau > F_i, F_j$ we have 

$\hat\rho_i = \dfrac{\tau |\hat\lambda_i| F_i + \tau E_i}{1 - \tau F_i} \le \tau \rho_i,$

and 

$\hat\rho_j \le \dfrac{|\hat\lambda_j| F_j/\tau + E_j/\tau}{1 - F_j/\tau}.$

Since $\tau \le 1$, we see that writing $\delta_j = |\hat\lambda_i - \hat\lambda_j|$, 

(4.4)  $\rho_i + \hat\rho_j < \delta_j$

is a sufficient condition for the disks $\Gamma_i(T\hat A T^{-1}, T\hat B T^{-1})$ and $\Gamma_j(T\hat A T^{-1}, T\hat B T^{-1})$ 
to be disjoint. (4.4) is satisfied if 

$\rho_i + \dfrac{|\hat\lambda_j| F_j/\tau + E_j/\tau}{1 - F_j/\tau} < \delta_j$ 
$\iff (\tau - F_j)(\delta_j - \rho_i) > |\hat\lambda_j| F_j + E_j$ 
$\iff \tau > F_j + \dfrac{|\hat\lambda_j| F_j + E_j}{\delta_j - \rho_i},$

where we used $\delta_j - \rho_i > 0$, which follows from the disjointness assumption. Here, 
since $\delta_j \ge \delta > \rho_i$, we see that (4.4) is true if 

$\tau > F_j + \dfrac{|\hat\lambda_j| F_j + E_j}{\delta - \rho_i}.$

Repeating the same argument for all $j \neq i$, we conclude that if 

(4.5)  $\tau > \max_{j\neq i} \Bigl\{ F_j + \dfrac{|\hat\lambda_j| F_j + E_j}{\delta - \rho_i} \Bigr\}$ $(\equiv \tau_0),$

then the disk $\Gamma_i(T\hat A T^{-1}, T\hat B T^{-1})$ is disjoint from the remaining $n - 1$ disks. 

Therefore, by letting $\tau = \tau_0$ and using Theorem 2.11 for the pencil $T\hat A T^{-1} - 
\lambda T\hat B T^{-1}$, we conclude that there exists exactly one eigenvalue $\lambda_i$ of the pencil 
$A - \lambda B$ such that 

(4.6)  $|\lambda_i - \hat\lambda_i| \le \dfrac{\tau_0(|\hat\lambda_i| F_i + E_i)}{1 - \tau_0 F_i} \le \tau_0 \rho_i.$

Using $\delta \le |\hat\lambda_i| + |\hat\lambda_j|$, we can bound $\tau_0$ from above by 

$\tau_0 \le \dfrac{\max_{j\neq i} \{ (2|\hat\lambda_j| + |\hat\lambda_i|) F_j + E_j \}}{\delta - \rho_i} \le \dfrac{\max_{j\neq i} \{ (2|\hat\lambda_j| + |\hat\lambda_i|) F_j + E_j \}}{(1 - F_i)(\delta - \rho_i)}.$

Also observe from (4.1) that 

$\rho_i = \dfrac{|\hat\lambda_i| F_i + E_i}{1 - F_i} \le \dfrac{\max_{1\le j\le n} \{ (2|\hat\lambda_j| + |\hat\lambda_i|) F_j + E_j \}}{1 - F_i}.$

Therefore, denoting $\delta' = \delta - \rho_i$ and $r = \max_{1\le j\le n} \{ (2|\hat\lambda_j| + |\hat\lambda_i|) F_j + E_j \}/(1 - F_i)$, 
we have $\tau_0 \le r/\delta'$ and $\rho_i \le r$. Hence, from (4.6) we conclude that 

(4.7)  $|\lambda_i - \hat\lambda_i| \le \dfrac{r^2}{\delta'}.$

Since r is essentially the size of the error, and $\delta'$ is essentially the gap between $\hat\lambda_i$ and 
any other computed eigenvalue, we note that this bound resembles the quadratic 
bound for the standard Hermitian eigenvalue problem, $|\lambda - \hat\lambda| \le \|E\|^2/\delta$ [7, Ch.11]. 
Our result (4.7) indicates that this type of quadratic error bound holds also for 
non-Hermitian generalized eigenvalue problems. 
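
To see how much is gained, the three bounds can be evaluated side by side. The sketch below reads $\tau_0$ as the maximum over $j \neq i$ of $F_j + (|\hat\lambda_j|F_j + E_j)/(\delta - \rho_i)$ and r as $\max_j\{(2|\hat\lambda_j| + |\hat\lambda_i|)F_j + E_j\}/(1 - F_i)$; the function name and the data are hypothetical.

```python
def forward_error_bounds(lam_hat, E, F, i):
    """Return (rho_i, bound_46, bound_47) for the i-th computed eigenvalue.

    rho_i    : simple bound (4.2)
    bound_46 : tau_0 (|lam_i| F_i + E_i) / (1 - tau_0 F_i), as in (4.6)
    bound_47 : r**2 / delta', as in (4.7)
    """
    n = len(lam_hat)
    Erow = [sum(abs(v) for v in row) for row in E]
    Frow = [sum(abs(v) for v in row) for row in F]
    others = [j for j in range(n) if j != i]
    rho_i = (abs(lam_hat[i]) * Frow[i] + Erow[i]) / (1 - Frow[i])
    delta = min(abs(lam_hat[i] - lam_hat[j]) for j in others)
    gap = delta - rho_i                      # delta' in the text
    if gap <= 0:
        raise ValueError("the i-th disk is not isolated")
    tau0 = max(Frow[j] + (abs(lam_hat[j]) * Frow[j] + Erow[j]) / gap
               for j in others)
    bound_46 = tau0 * (abs(lam_hat[i]) * Frow[i] + Erow[i]) / (1 - tau0 * Frow[i])
    r = max((2 * abs(lam_hat[j]) + abs(lam_hat[i])) * Frow[j] + Erow[j]
            for j in range(n)) / (1 - Frow[i])
    return rho_i, bound_46, r * r / gap

# Made-up data for illustration only.
lam_hat = [1.0, 2.0, 4.0]
E = [[0, 1e-3, 0], [1e-3, 0, 1e-3], [0, 1e-3, 0]]
F = [[0, 2e-3, 0], [2e-3, 0, 2e-3], [0, 2e-3, 0]]
rho_i, bound_46, bound_47 = forward_error_bounds(lam_hat, E, F, 0)
```

For this data the simple bound is about 3e-3, while (4.6) and (4.7) are of the order of the squared error size, which illustrates the quadratic behaviour discussed above.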